
Conversation

@ngquangtrung57
Collaborator

Motivation

Support Qwen 2.5 Omni

Modifications

  • Add Qwen 2.5 Omni
  • Add video backend: qwen_omni_utils

Checklist

  • Format your code
  • Add unit tests
  • Update documentation as needed, including docstrings or example tutorials.

@ngquangtrung57 ngquangtrung57 requested a review from kcz358 October 2, 2025 18:46
Collaborator


Can probably refactor these changes into a new Qwen Omni dataset that inherits from this one and overrides what it needs, keeping vision_audio_dataset unchanged.

Collaborator


Can override this logic in qwen2_5_omni_processor and keep this file unchanged.

Comment on lines 72 to 91
# for Qwen2.5-Omni, use ThinkerForConditionalGeneration
if config.model_type == "qwen2_5_omni":
    from transformers.models.qwen2_5_omni.modeling_qwen2_5_omni import (
        Qwen2_5OmniThinkerForConditionalGeneration,
    )

    class Qwen2_5OmniThinkerForConditionalGenerationWithDtype(
        Qwen2_5OmniThinkerForConditionalGeneration
    ):
        @classmethod
        def from_pretrained(cls, *args, **kwargs):
            if "torch_dtype" not in kwargs:
                kwargs["torch_dtype"] = "auto"
            model = super().from_pretrained(*args, **kwargs)
            import torch

            model = model.to(dtype=torch.bfloat16)
            return model

    return Qwen2_5OmniThinkerForConditionalGenerationWithDtype
Collaborator


Qwen2_5OmniThinkerForConditionalGeneration is itself a PreTrainedModel and can use from_pretrained directly, so this part is redundant. Register it as an auto causal LM and remove this.
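For reference, the registration being suggested could look roughly like the sketch below (an assumption-laden illustration, not the PR's actual code: it assumes the thinker config class is Qwen2_5OmniThinkerConfig and that hooking into the Auto* registries is the intended place):

from transformers import AutoConfig, AutoModelForCausalLM
from transformers.models.qwen2_5_omni.configuration_qwen2_5_omni import (
    Qwen2_5OmniThinkerConfig,
)
from transformers.models.qwen2_5_omni.modeling_qwen2_5_omni import (
    Qwen2_5OmniThinkerForConditionalGeneration,
)

# Map the thinker's own model_type to its config, then map the config to the thinker
# model class, so AutoModelForCausalLM.from_pretrained can resolve a thinker checkpoint.
AutoConfig.register(
    Qwen2_5OmniThinkerConfig.model_type, Qwen2_5OmniThinkerConfig, exist_ok=True
)
AutoModelForCausalLM.register(
    Qwen2_5OmniThinkerConfig, Qwen2_5OmniThinkerForConditionalGeneration
)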

Comment on lines +97 to +106
if rms_norm:
    modeling_qwen2_5_omni.Qwen2RMSNorm = LigerRMSNorm
if cross_entropy:
    modeling_qwen2_5_omni.CrossEntropyLoss = LigerCrossEntropyLoss
if fused_linear_cross_entropy:
    modeling_qwen2_5_omni.Qwen2_5OmniThinkerForConditionalGeneration.forward = (
        qwen2_5_omni_lce_forward
    )
if swiglu:
    modeling_qwen2_5_omni.Qwen2MLP = LigerSwiGLUMLP
Collaborator


Qwen2.5 Omni doesn't have Qwen2RMSNorm or Qwen2MLP, so this won't patch the correct modules. Check https://github.com/huggingface/transformers/blob/main/src/transformers/models/qwen2_5_omni/modeling_qwen2_5_omni.py for the correct names.
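Not from the PR, but a quick way to confirm which classes the Omni modeling file actually exposes before monkey-patching (it only assumes the transformers module path above):

import inspect

import transformers.models.qwen2_5_omni.modeling_qwen2_5_omni as modeling_qwen2_5_omni

# List the norm/MLP classes defined in modeling_qwen2_5_omni so the Liger patch
# targets names that actually exist in the installed transformers version.
candidates = [
    name
    for name, obj in vars(modeling_qwen2_5_omni).items()
    if inspect.isclass(obj) and ("RMSNorm" in name or "MLP" in name)
]
print(candidates)  # patch these names rather than assuming Qwen2RMSNorm / Qwen2MLP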


Collaborator

@kcz358 kcz358 Oct 10, 2025


I think I scanned to this: https://github.com/huggingface/transformers/blob/0419ff881d7bb503f4fc0f0a7a5aac3d012c9b91/src/transformers/models/qwen2_5_omni/modeling_qwen2_5_omni.py#L976-L987. Not sure if it can also be patched. But yeah, it should be fine to patch these two with Qwen2RMSNorm and Qwen2MLP. Should be fine for now, no need to change this part.

Collaborator


This doesn't actually remove padding from the ops right now. The input ids will be 2D, and I suspect you are directly using Qwen2.5 VL's forward? This could be buggy.

Comment on lines 75 to 78
elif config.model_type == "qwen2_5_omni":
    self.config = (
        config.text_config if hasattr(config, "text_config") else config
    )
Collaborator


If the config is already reset here, there's actually no need to rewrite _estimate_qwen2_flops later.

@ngquangtrung57
Collaborator Author

  • Most of the rmpad logic falls under the liger file rather than the ops file, because Qwen Omni doesn't have an intermediate model layer.
  • I had some issues with registering and loading the Thinker only, so I still modify the mapping_func file.
  • I have fixed the tmrope issue I mentioned that caused the error with sp. However, in my testing with sp=2 on 8 GPUs, the first step ran successfully and then the second step hit an NCCL timeout on the AlltoAll operation. This could be an issue with my server, because I get the same error on other projects when the GPUs perform AllReduce or AllGather.

@ngquangtrung57 ngquangtrung57 requested a review from kcz358 October 12, 2025 17:22
Comment on lines 72 to 94

# For Qwen2.5-Omni, load only the thinker model (not the full model with the talker);
# this needs special handling, otherwise the full model would be loaded.
if hasattr(config, "model_type") and config.model_type == "qwen2_5_omni":

    class Qwen2_5OmniThinkerLoader:
        @staticmethod
        def from_pretrained(pretrained_model_name_or_path, *args, **kwargs):
            # Load the full model first
            full_model = AutoModelForCausalLM.from_pretrained(
                pretrained_model_name_or_path, *args, **kwargs
            )
            # Extract and return only the thinker
            thinker_model = full_model.thinker
            # Clean up the full model to save memory
            if hasattr(full_model, "talker"):
                del full_model.talker
            if hasattr(full_model, "token2wav"):
                del full_model.token2wav
            del full_model
            return thinker_model

    return Qwen2_5OmniThinkerLoader

Collaborator


This part should be removed

Collaborator

@kcz358 kcz358 left a comment


Hi, I think most of the parts look fine right now. A few suggestions could improve this PR further. I see that we are only training the Thinker part, so I think you can wrap the actual training model into a PreTrainedModel and save it as a pretrained checkpoint. Then you can simply use from_pretrained with that checkpoint and rename the model to qwen2_5_omni_thinker (you could also push the converted checkpoint to the hub, maybe under lmms-lab).

@kcz358
Collaborator

kcz358 commented Oct 15, 2025

To clarify more, my feeling is that you can actually drop the whole talker module. You can do it as follows (see the sketch after this comment):

You can see here that the thinker module itself is a PreTrainedModel and has its own model type:
https://github.com/huggingface/transformers/blob/94df0e65602922be2831b3faa457a2bde78b936b/src/transformers/models/qwen2_5_omni/configuration_qwen2_5_omni.py#L413-L484

So you can

  1. First load the pretrained weights from the original Qwen2.5-Omni checkpoint
  2. Extract the thinker module (can be something like model.thinker, model.thinker.thinker_config)
  3. Save the pretrained thinker checkpoint, config, and preprocessor using save_pretrained
  4. In the init part, register the thinker module as an AutoModelForCausalLM (it is not registered by transformers, right?)
  5. Load from your pretrained checkpoint, processor, and config so that we have a very clean mapping func and registration
  6. [Optional] Push the checkpoint to the hub so that everyone can use it

You might need to change your patching logic accordingly, though, because the model structure changes a bit.
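A minimal sketch of steps 1-3 above (the source checkpoint id and output directory are only illustrative, and dtype handling is simplified):

import torch
from transformers import AutoProcessor, Qwen2_5OmniForConditionalGeneration

src = "Qwen/Qwen2.5-Omni-7B"        # original full checkpoint (thinker + talker + token2wav)
out_dir = "./qwen2_5_omni_thinker"  # example output path for the thinker-only checkpoint

# 1. Load the original full model
full_model = Qwen2_5OmniForConditionalGeneration.from_pretrained(
    src, torch_dtype=torch.bfloat16
)

# 2. Extract the thinker module (a standalone PreTrainedModel with its own config)
thinker = full_model.thinker

# 3. Save the thinker weights/config plus the preprocessor alongside them
thinker.save_pretrained(out_dir)
AutoProcessor.from_pretrained(src).save_pretrained(out_dir)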

@kcz358
Collaborator

kcz358 commented Oct 15, 2025

  • Most of the rmpad logic falls under the liger file rather than the ops file, because Qwen Omni doesn't have an intermediate model layer.
  • I had some issues with registering and loading the Thinker only, so I still modify the mapping_func file.
  • I have fixed the tmrope issue I mentioned that caused the error with sp. However, in my testing with sp=2 on 8 GPUs, the first step ran successfully and then the second step hit an NCCL timeout on the AlltoAll operation. This could be an issue with my server, because I get the same error on other projects when the GPUs perform AllReduce or AllGather.

For the sp AlltoAll ops, this could possibly be caused by a tensor shape mismatch during the gather operations, so it gets stuck there. You may need to check the split size on each rank and validate whether the gather size is being set correctly.
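A rough debugging sketch for that check (the helper and its call site are hypothetical; it only assumes an initialized torch.distributed NCCL process group):

import torch
import torch.distributed as dist

def check_sp_splits(local_seq_len: int, sp_group=None) -> None:
    """Gather every rank's local sequence length before the AlltoAll and assert they match."""
    world_size = dist.get_world_size(group=sp_group)
    local = torch.tensor([local_seq_len], device=torch.cuda.current_device())
    gathered = [torch.zeros_like(local) for _ in range(world_size)]
    dist.all_gather(gathered, local, group=sp_group)
    lengths = [int(t.item()) for t in gathered]
    if dist.get_rank(group=sp_group) == 0:
        print(f"[sp debug] per-rank seq lens: {lengths}")
    # Equal-split all_to_all_single requires identical lengths on every rank;
    # a mismatch here would explain a hang in the collective.
    assert len(set(lengths)) == 1, f"mismatched sp split sizes: {lengths}"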

ngquangtrung57 and others added 3 commits October 17, 2025 15:04
Removed comment about using exist_ok=True in AutoConfig registration.
@ngquangtrung57
Collaborator Author

I have extracted the thinker only and pushed it here: "ngqtrung/Qwen2.5-Omni-Thinker-7B". When I tried to move it to the lmms-lab hub, it showed "The storage patterns of org lmms-lab tripped our internal systems! Please contact us at [email protected] so we can verify your account and unlock more storage for your use-case.", so for now I keep it in my hub.
I have checked the tensor shapes of the AlltoAll op and they look correct. Also, the hang happens after the lce forward (after computing the loss), not when performing "gather_heads_scatter_seq". I think this could be related to the trainer or, as I said, to my server itself.

@ngquangtrung57 ngquangtrung57 requested a review from kcz358 October 17, 2025 07:14
@kcz358
Collaborator

kcz358 commented Oct 20, 2025

Hi Trung, thank you for all the efforts. I think most of the parts look nice and LGTM. Can you resolve the conflict and change the example config to the new format? Since we are using Hydra now, we can either use a config yaml or simply override args via the CLI. Check examples/qwen3_vl for detailed information. Thanks!

@ngquangtrung57 ngquangtrung57 merged commit 25aecdf into main Oct 20, 2025
2 checks passed
@kcz358
Collaborator

kcz358 commented Oct 20, 2025

Please don't use merge, because it merges every commit into the main branch and causes a non-linear history. Instead, use squash and merge. I will revert this PR for now and re-PR this feature.

@kcz358
Collaborator

kcz358 commented Oct 20, 2025

Oh, actually on my testing envs this also breaks some of my CI/CD in some cases. Let me see if we can fix it together in a new PR.

@ngquangtrung57
Collaborator Author

Sure, I can help with the fix. What's the current issue?

@kcz358
Collaborator

kcz358 commented Oct 20, 2025

It's okay, I fixed the import error already.

@kcz358 kcz358 mentioned this pull request Oct 20, 2025